Stylochronometry: Timeline Prediction in Stylometric Analysis

Authors

  • Carmen Klaussner
  • Carl Vogel
Abstract

We examine stylochronometry, the question of measuring change in linguistic style over time within an authorial canon and in relation to change in language in general use over a contemporaneous period. We take the works of two prolific authors from the 19th/20th century, Henry James and Mark Twain, and identify variables that change for them over time. We present a method of analysis that applies regression to linguistic variables in order to predict a temporal variable. To identify individual authors’ effects on the model, we compare the model based on the novelists’ works to a model based on a 19th/20th century American English reference set. We evaluate using R² and root mean square error (RMSE), which indicates the average error in predicting the year. On the two-author data, we achieve an RMSE of ±7.2 years on unseen data (baseline: ±13.2); for the larger reference set, our model obtains an RMSE of ±4 on unseen data (baseline: ±17).
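The evaluation described in the abstract (regress a year on linguistic variables, then compare RMSE on unseen data against a baseline) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which regression variant or baseline the authors used, so the sketch uses single-predictor least squares and a baseline that always predicts the mean training year. The feature values and years are invented toy numbers, not data from the paper.

```python
from math import sqrt


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(actual, predicted):
    """Root mean square error: typical size of the prediction error, in years."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Invented toy data: relative frequency of one hypothetical stylistic
# feature per text, paired with the text's publication year.
train_freq = [0.010, 0.012, 0.015, 0.017, 0.020]
train_year = [1880, 1885, 1890, 1895, 1900]
test_freq = [0.011, 0.016, 0.019]
test_year = [1882, 1893, 1899]

intercept, slope = fit_simple_ols(train_freq, train_year)
model_pred = [intercept + slope * f for f in test_freq]

# Assumed baseline: always predict the mean year of the training texts.
mean_train_year = sum(train_year) / len(train_year)
baseline_pred = [mean_train_year] * len(test_year)

model_rmse = rmse(test_year, model_pred)
baseline_rmse = rmse(test_year, baseline_pred)
model_r2 = r_squared(test_year, model_pred)

print(f"model RMSE: {model_rmse:.2f} years, R^2: {model_r2:.3f}")
print(f"baseline RMSE: {baseline_rmse:.2f} years")
```

Reporting the model's RMSE next to a baseline RMSE, as the abstract does, shows how much of the error reduction comes from the linguistic variables rather than from the narrow range of years in the data.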

Similar resources

Author Profiling using Stylometric and Structural Feature Groupings

In this paper we present an approach for the task of author profiling. We propose a coherent grouping of features combined with appropriate preprocessing steps for each group. The groups we used were stylometric and structural, featuring, among others, trigrams and counts of Twitter-specific characteristics. We address gender and age prediction as a classification task and personality prediction...


A Framework for Stylometric Similarity Detection in Online Settings

Online marketplaces and communication media such as email, web sites, forums, and chat rooms have been ubiquitously integrated into our everyday lives. Unfortunately, the anonymous nature of these channels makes them an ideal avenue for online fraud, hackers, and cybercrime. Anonymity and the sheer volume of online content make cyber identity tracing an essential yet strenuous endeavor for Inte...


Stylometric Analysis of Bloggers' Age and Gender

We report results on stylometric differences in blogging across gender and age groups. The results are based on two mutually independent features. The first feature is the use of slang words, a new concept that we propose for the stylometric study of bloggers. Slang is a non-dictionary word that has evolved with time due to its frequent and popular usage. For the second feature, we hav...


Moving Beyond Monitoring ... PDQ (Pretty Damn Quick)

Performance management can be broken into three sequential processes: performance monitoring, performance analysis, and performance modeling. (Fig. 1) Monitoring is the theme of this issue, analysis refers to the capability of looking for patterns in monitored data that reside in a database, while modeling attempts to use monitored data to predict future events, such as resource bottlenecks. PD...


Mental Timeline in Persian Speakers’ Co-speech Gestures based on Lakoff and Johnson’s Conceptual Metaphor Theory

One of the introduced conceptual metaphors is the metaphor of "time as space". Time as an abstract concept is conceptualized by a concrete concept like space. This conceptualization of time is also reflected in co-speech gestures. In this research, we try to find out what dimension and direction the mental timeline has in co-speech gestures and under the influence of which one of the metaphoric...



Publication date: 2015